Molecular Dissection of a Quantitative Trait Locus: A Phenylalanine-to-Tyrosine Substitution in the Transmembrane Domain of the Bovine Growth Hormone
Genetics, 2003, Issue 1
    a Department of Genetics, Faculty of Veterinary Medicine, University of Liège, 4000 Liège, Belgium; b Animal Production Research, Agricultural Research Centre MTT, 31600 Jokioinen, Finland; c Livestock Improvement Corporation, Hamilton, New Zealand;

    d Vialactia Biosciences (NZ) Limited, Clinical Building, University of Auckland Medical School, Auckland, New Zealand

    ABSTRACT

    We herein report on our efforts to improve the mapping resolution of a QTL with major effect on milk yield and composition that was previously mapped to bovine chromosome 20. By using a denser chromosome 20 marker map and by exploiting linkage disequilibrium using two distinct approaches, we provide strong evidence that a chromosome segment including the gene coding for the growth hormone receptor accounts for at least part of the chromosome 20 QTL effect. By sequencing individuals with known QTL genotype, we identify an F to Y substitution in the transmembrane domain of the growth hormone receptor gene that is associated with a strong effect on milk yield and composition in the general population.

    WITH the development of genome-wide marker maps for several species it is becoming possible to map quantitative trait loci (QTL) underlying the genetic variation for continuously distributed phenotypes of medical and agronomical importance (e.g., ANDERSSON 2001; FLINT and MOTT 2001; MACKAY 2001; MAURICIO 2001). The mapping resolution that is achieved with most experimental designs, however, is in the tens of centimorgans at best, therefore precluding efficient marker-assisted selection, let alone positional cloning of the corresponding genes.

    When working with model organisms or plants, strategies to improve the mapping resolution most often require breeding of a large number of progeny to increase the density of crossovers in the chromosome regions of interest (e.g., DARVASI 1998 ). When working with humans or farm animals, this approach is not practical and alternative strategies need to be identified. One approach that has recently received considerable attention is linkage disequilibrium (LD) mapping, which aims at exploiting historical recombinants. However, as useful LD is expected to extend only over limited distances (e.g., 60 kb in the human; REICH et al. 2001 ) this approach requires a commensurate increase in marker density.

    In some livestock populations, including dairy cattle, LD has been shown to extend over very long chromosome segments when compared to human populations (FARNIR et al. 2000). This has raised hope that LD would be readily exploitable in these species using the presently available medium density maps (e.g., KAPPES et al. 1997). These hopes have been supported by a series of initial, promising results dealing with both simple (e.g., CHARLIER et al. 1996) and complex inherited traits (e.g., GRISART et al. 2002). Potential downsides associated with this long-range LD are (i) a possible limited mapping resolution and (ii) the occurrence of association in the absence of linkage due to gametic association between nonsyntenic loci.

    We herein use LD to improve our molecular understanding of a QTL influencing milk yield and composition that was previously mapped to bovine chromosome 20 (GEORGES et al. 1995; ARRANZ et al. 1998).

    MATERIALS AND METHODS

    Pedigree material:

    The pedigree material used in this study was composed of the following:

    Data set I: a previously described Black-and-White Holstein-Friesian granddaughter design (GDD) sampled in the Netherlands and composed of 22 paternal half-sib families for a total of 987 bulls (SPELMAN et al. 1996; COPPIETERS et al. 1998A).

    Data set II: 276 progeny-tested Holstein-Friesian sires sampled in the Netherlands.

    Data set III: 1550 progeny-tested Holstein-Friesian sires sampled in New Zealand.

    Data set IV: 959 progeny-tested Jersey sires sampled in New Zealand.

    Data set V: 485 Holstein-Friesian cows sampled in New Zealand.

    Data set VI: 387 Jersey cows sampled in New Zealand.

    Phenotypes:

    Phenotypes were, respectively, daughter yield deviations (DYD) for bulls, lactation values (LV, the unregressed first lactation yield deviations) for cows, and average parental predicted transmitting abilities (PTA) for bulls and cows, each for milk, protein, and fat yield, as well as protein and fat percentage (VAN RADEN and WIGGANS 1991). DYDs, lactation values, and PTA were obtained directly from CR-DELTA (Netherlands; data sets I and II) or Livestock Improvement Corporation (LIC, New Zealand; data sets III–VI), respectively.

    Map construction:

    Microsatellite genotyping, map construction, and information content mapping were performed as previously described (COPPIETERS et al. 1998A). Sequence information for the primers used for PCR amplification of anonymous type II microsatellite markers can be obtained from ArkDB (). The following primers were designed on the basis of HEAP et al. (1995) to amplify a microsatellite in the promoter region of the growth hormone receptor gene: GHRJA.UP, 5'-TGCTCTAATCTTTTCTGGTACCAGG-3', and GHRJA.DN, 5'-TCCTCCCCAAATCAATTACATTTTCTC-3'.

    Conventional QTL mapping:

    QTL mapping was performed by multimarker regression (KNOTT et al. 1996 ) using the previously described HSQM software (COPPIETERS et al. 1998B ). Chromosome-wide significance thresholds were determined by permutation as previously described (CHURCHILL and DOERGE 1995 ; COPPIETERS et al. 1998B ). Segregating sire families were identified on the basis of the results of within-family analyses as previously described (COPPIETERS et al. 1998A ).

    Haplotype-based test for association:

    Assumptions: Assume a QTL that is characterized by two additively acting alleles, Q and q, that segregate in the population of interest with respective allelic frequencies of fQ and (1 - fQ). Assume that the Q allele appeared in the population by mutation or migration on a chromosome with haplotype "H" for a series of flanking markers. All other haplotypes are pooled and referred to as "O." At the present generation the H haplotype may still be in LD with the Q allele by an amount D. The H to O haplotype substitution effect, α, can then be shown to equal

    $$\alpha = \frac{D\,a}{f_H\,(1 - f_H)}$$

    where a corresponds to one-half the difference between the phenotypic values of QQ vs. qq individuals, and fH corresponds to the population frequency of the H haplotype (FALCONER and MACKAY 1996).
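
    The relationship between LD and the haplotype substitution effect can be checked numerically. The sketch below assumes the standard result α = a·D / (fH(1 - fH)), which follows from P(Q|H) - P(Q|O) = D/fH + D/(1 - fH) for an additive biallelic QTL; the function names are illustrative and not part of the original analysis.

```python
def ld_from_haplotype_freqs(f_HQ, f_H, f_Q):
    """LD coefficient D between haplotype H and allele Q: D = f(HQ) - f(H) * f(Q)."""
    return f_HQ - f_H * f_Q


def haplotype_substitution_effect(D, a, f_H):
    """Expected H-to-O substitution effect for an additive biallelic QTL.

    Assumes alpha = a * [P(Q|H) - P(Q|O)] = a * D / (f_H * (1 - f_H)).
    """
    if not 0.0 < f_H < 1.0:
        raise ValueError("f_H must lie strictly between 0 and 1")
    return a * D / (f_H * (1.0 - f_H))


# Complete association: every H chromosome carries Q and Q occurs only on H.
D = ld_from_haplotype_freqs(f_HQ=0.2, f_H=0.2, f_Q=0.2)
alpha = haplotype_substitution_effect(D, a=1.0, f_H=0.2)
```

    In this limiting case the haplotype substitution effect equals the allele substitution effect a, and it shrinks toward zero as D decays.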

    Test for association: In our GDD, phased-marker genotypes are available for all sons and their sires but not for their dams, as these are not marker genotyped. Defining Ti as [DYDi - PAi], where DYDi is the daughter yield deviation of son i and PAi is the average predicted transmitting ability (VAN RADEN and WIGGANS 1991) of the sire and dam of son i, one can express the expected value of Ti as a function of the marker genotype of the sire's chromosomes (SC) and the marker genotypes of the paternal (PC) and maternal gametes (MC) inherited by son i, as shown in Table 1.

    Table 1. Expected values of T (= DYD - PA) as a function of the marker genotype of the sire and the marker genotypes of the paternal and maternal gametes inherited by the son (table omitted)

    Expected values of Ti can be seen to be linear functions of the unknown haplotype substitution effect, α. A least-squares estimator of α can therefore easily be obtained by linear regression, while the ratio

    $$F = \frac{SSR}{SSE/(n - 2)}$$

    which is distributed as an F-statistic with 1 and n - 2 d.f., can be used to measure the evidence in favor of a statistically significant haplotype substitution effect. In this ratio, n corresponds to the number of sons available in the GDD, SSE to the residual sum of squares, and SSR to the regression sum of squares.
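
    The regression and F-ratio described above can be sketched as follows. This is a minimal illustration, assuming a simple linear regression with intercept of Ti on xi, where xi is the coefficient of α in the expected value of Ti; how xi is derived from the SC, PC, and MC genotypes is given by Table 1 and is not reproduced here, so the input x is hypothetical.

```python
def haplotype_f_test(x, t):
    """Least-squares estimate of alpha and F-ratio = SSR / (SSE / (n - 2)).

    x: per-son coefficients of alpha in E[T_i] (hypothetical inputs)
    t: per-son phenotypes T_i = DYD_i - PA_i
    """
    n = len(x)
    if n != len(t) or n < 3:
        raise ValueError("need equally long inputs with at least 3 sons")
    mean_x = sum(x) / n
    mean_t = sum(t) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0.0:
        raise ValueError("x must vary between sons")
    sxy = sum((xi - mean_x) * (ti - mean_t) for xi, ti in zip(x, t))
    alpha_hat = sxy / sxx                      # slope = substitution effect
    intercept = mean_t - alpha_hat * mean_x
    ssr = alpha_hat * sxy                      # regression sum of squares
    sse = sum((ti - intercept - alpha_hat * xi) ** 2 for xi, ti in zip(x, t))
    f_ratio = ssr / (sse / (n - 2))            # 1 and n - 2 d.f.
    return alpha_hat, f_ratio


alpha_hat, f_ratio = haplotype_f_test([0, 0, 1, 1], [1.0, 3.0, 4.0, 6.0])
```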

    By using Ti as phenotype, one essentially performs a transmission disequilibrium test (TDT; SPIELMAN et al. 1993 ), which simultaneously tests for association and linkage. As the dams are not genotyped, however, our TDT reduces in part to a conventional association test.

    Choice of markers and haplotypes: So far, we have not defined which of the m markers available on the chromosome have to be considered when defining a haplotype. As the exact location of the QTL and the size of the haplotype that will maximize α are both unknown, all possible windows comprising between one and m adjacent markers are tested separately. We thus examine m windows of one marker, (m - 1) windows of two markers, (m - 2) windows of three markers, ... , and one window of m markers.
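
    The window enumeration can be sketched in a few lines. The function name and the 0-based inclusive index convention are illustrative choices, not taken from the original software.

```python
def marker_windows(m):
    """Return all windows of adjacent markers as (first, last) 0-based inclusive indices.

    Windows are listed by increasing size: m windows of one marker,
    m - 1 windows of two markers, ..., one window of m markers.
    """
    return [
        (start, start + size - 1)
        for size in range(1, m + 1)
        for start in range(m - size + 1)
    ]


windows = marker_windows(4)
```

    For m markers this yields (m² + m)/2 windows in total, which is the number of tests performed per candidate haplotype.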

    Having selected the markers comprising the haplotype, one still has to choose the H haplotype among all haplotypes encountered in the population. In the proposed approach, the haplotypes that were successively considered as H haplotypes correspond to the chromosomes of the s sires in the GDD that were known to be heterozygous Qq for the QTL on the basis of the results of a marker-assisted segregation analysis performed on their sons (see above). As one does not a priori know which of the sire's homologs carries the Q allele, the haplotypes corresponding to both chromosomes are examined, for a total of 2s homologs.

    When estimating the substitution effect of the haplotypes of a given sire, his sons were eliminated from the data set, to avoid extracting information that would be redundant with the linkage analysis.

    Significance thresholds: The F-ratio defined above does not account for the multiple tests that are performed, i.e., the (m² + m)/2 marker windows tested for each of the 2s homologs. We accounted for multiple testing by applying a permutation test. The phenotypes and marker genotypes were shuffled 1000 times and the 2s(m² + m)/2 tests were performed on each permuted data set. The highest F-ratios obtained with the real data were then compared with the highest F-ratios obtained across the 1000 permutations.

    Simultaneous mining of linkage and linkage disequilibrium:
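
    A permutation test on the maximum statistic can be sketched as below. This is an illustration under stated assumptions: `max_statistic` is a hypothetical callable that returns the highest F-ratio over all windows and homologs for a given ordering of the phenotypes (marker genotypes held fixed), and the (count + 1)/(N + 1) estimate is a common convention rather than something specified in the text.

```python
import random


def permutation_p_value(phenotypes, max_statistic, n_permutations=1000, seed=0):
    """Experiment-wise p-value of the observed maximum statistic.

    Phenotypes are shuffled relative to the (fixed) marker genotypes, and the
    maximum statistic is recomputed for every permuted data set.
    """
    rng = random.Random(seed)
    observed = max_statistic(phenotypes)
    shuffled = list(phenotypes)
    n_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if max_statistic(shuffled) >= observed:
            n_as_extreme += 1
    # Add-one convention so that the estimate is never exactly zero.
    return (n_as_extreme + 1) / (n_permutations + 1)


# Toy statistic: association between phenotype and position in the list.
toy_stat = lambda ph: sum(i * p for i, p in enumerate(ph))
p_value = permutation_p_value([1, 2, 3, 4, 5, 6, 7, 8], toy_stat, n_permutations=99)
```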

    QTL fine-mapping exploiting both linkage and LD: The mapping method used is implemented in the LD variance component mapping (LDVCM) programs. Testing for the presence of a QTL at map position p of the studied chromosome was performed as follows:

    For all markers on the studied chromosome, we determine the marker linkage phase of the sires and sons as described (FARNIR et al. 2002). As a consequence, the marker data then consist of 2s sire chromosomes (SC), n paternally inherited chromosomes of the sons (PC), and n maternally inherited chromosomes of the sons (MC), where s and n correspond, respectively, to the number of sire families and the number of sons in the GDD. From the genotypes of the PC, we can easily compute the probability that son i inherited the "left" (λp) or "right" (ρp = 1 - λp) SC from its sire at map position p as described (COPPIETERS et al. 1998B).

    We compute identity-by-descent (IBD) probabilities (φp) for all pairwise combinations of SC and MC using the method described by MEUWISSEN and GODDARD 2001. This method approximates the probability that two chromosomes are IBD at a given map position conditional on the identity-by-state (IBS) status of flanking markers, on the basis of coalescent theory (HUDSON 1985). Windows of 16 markers were considered to compute φp.

    Using (1 - φp) as a distance measure, we apply the UPGMA hierarchical clustering algorithm (e.g., MOUNT 2001) to generate a rooted dendrogram representing the genetic relationship, at position p, between all SC and MC haplotypes encountered in the population.

    We use the logical framework provided by this dendrogram to group the SC and MC in functionally distinct clusters. A cluster is defined as a group of haplotypes that coalesce into a common node. A useful feature of UPGMA trees in this regard is that the distance (1 - φp) between all the haplotypes that coalesce into a given node is less than or equal to two times the distance between the node and any of these haplotypes. As a consequence, the tree is scanned downward from the root and branches are cut until nodes are reached such that all coalescing haplotypes (i.e., all haplotypes within the cluster) have a distance measure (1 - φp) < T (KIM and GEORGES 2002).
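
    The clustering and tree-cutting steps can be sketched in plain Python. This is a simplified illustration, not the LDVCM implementation: the distance matrix `d` is assumed to hold 1 - φp for every pair of haplotypes, the tree is stored as nested tuples, and a node is accepted as a cluster when every pair of its haplotypes is closer than the threshold T.

```python
from itertools import combinations


def upgma(d):
    """Build a UPGMA tree (nested tuples of leaf indices) from a symmetric matrix d."""
    clusters = {i: (i, [i]) for i in range(len(d))}  # id -> (subtree, leaves)

    def average_distance(a, b):
        leaves_a, leaves_b = clusters[a][1], clusters[b][1]
        total = sum(d[i][j] for i in leaves_a for j in leaves_b)
        return total / (len(leaves_a) * len(leaves_b))

    next_id = len(d)
    while len(clusters) > 1:
        # Merge the two clusters with the smallest average pairwise distance.
        a, b = min(combinations(sorted(clusters), 2),
                   key=lambda pair: average_distance(*pair))
        merged = ((clusters[a][0], clusters[b][0]), clusters[a][1] + clusters[b][1])
        del clusters[a], clusters[b]
        clusters[next_id] = merged
        next_id += 1
    return next(iter(clusters.values()))[0]


def leaves_of(tree):
    """List the leaf indices below a node."""
    if isinstance(tree, int):
        return [tree]
    return leaves_of(tree[0]) + leaves_of(tree[1])


def cut_tree(tree, d, T):
    """Scan from the root and cut until all haplotypes in a node are within T."""
    members = leaves_of(tree)
    if all(d[i][j] < T for i, j in combinations(members, 2)):
        return [sorted(members)]
    return cut_tree(tree[0], d, T) + cut_tree(tree[1], d, T)


# Toy IBD probabilities for four haplotypes (hypothetical values).
phi = [[1.0, 0.9, 0.1, 0.1],
       [0.9, 1.0, 0.1, 0.1],
       [0.1, 0.1, 1.0, 0.8],
       [0.1, 0.1, 0.8, 1.0]]
dist = [[1.0 - v for v in row] for row in phi]
tree = upgma(dist)
clusters_at_half = cut_tree(tree, dist, T=0.5)
```

    With T = 0.5 the four toy haplotypes fall into two clusters; lowering T splits them into singletons and raising it toward 1 merges them into one.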

    We model the sons' phenotypes (DYDs) using the following linear model:

    $$y = Xb + Z_h h + Z_u u + e$$

    y is the vector of phenotype records of all sons. b is a vector of fixed effects, which in this study reduces to the overall mean. X is the incidence matrix relating fixed effects to individual sons, which in this study reduces to a vector of ones. h is the vector of random QTL effects corresponding to the defined haplotype clusters. Zh is an incidence matrix relating haplotype clusters to individual sons. In Zh, a maximum of three elements per row can have nonzero value: "1" in the column corresponding to the cluster to which the MC haplotype belongs, and λp and ρp in the columns corresponding, respectively, to the haplotype clusters of the left and right SC. If the SC and/or MC belong to the same cluster, the corresponding coefficients are added. u is the vector of random individual polygenic effects ("animal model", LYNCH and WALSH 1997). Zu is a diagonal incidence matrix relating individual polygenic effects to individual sons. e is the vector of individual error terms.
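
    The construction of one row of Zh can be sketched as follows. The sketch assumes, in line with the definitions given earlier, that λp weights the cluster of the left SC and ρp = 1 - λp that of the right SC; the function name and argument layout are illustrative.

```python
def zh_row(n_clusters, mc_cluster, left_sc_cluster, right_sc_cluster, lam):
    """Row of Zh for one son at map position p.

    mc_cluster:        cluster index of the maternally inherited chromosome
    left_sc_cluster:   cluster index of the sire's "left" chromosome
    right_sc_cluster:  cluster index of the sire's "right" chromosome
    lam:               probability that the son inherited the left SC
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be a probability")
    row = [0.0] * n_clusters
    row[mc_cluster] += 1.0               # maternal haplotype, known with certainty
    row[left_sc_cluster] += lam          # lambda_p
    row[right_sc_cluster] += 1.0 - lam   # rho_p = 1 - lambda_p
    return row


row_distinct = zh_row(4, mc_cluster=2, left_sc_cluster=0, right_sc_cluster=1, lam=0.7)
row_shared = zh_row(3, mc_cluster=1, left_sc_cluster=1, right_sc_cluster=1, lam=0.25)
```

    Coefficients are accumulated with `+=`, so chromosomes that fall in the same cluster have their coefficients added, and every row sums to 2 (one maternal and one paternal haplotype per son).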

    Haplotype cluster effects with corresponding variance, σ²H, individual polygenic effects with corresponding variance, σ²A, and individual error terms with corresponding variance, σ²E, were estimated using average information restricted maximum likelihood (AIREML; JOHNSON and THOMPSON 1995), by maximizing the restricted log-likelihood function L:

    $$L = -\tfrac{1}{2}\left[\ln|V| + \ln|X^{T}V^{-1}X| + (y - X\hat{b})^{T}V^{-1}(y - X\hat{b})\right]$$

    In this,

    $$V = \sigma_H^2 Z_h H Z_h^{T} + \sigma_A^2 Z_u A Z_u^{T} + \sigma_E^2 I$$

    Because we assume that the covariance between the QTL effects of the different haplotype clusters is zero, H reduces to an identity matrix. This differentiates our approach from that of MEUWISSEN and GODDARD 2000, in which H is the matrix of between-haplotype IBD probabilities. A is the additive genetic relationship matrix (LYNCH and WALSH 1997).

    The haplotype clustering and model-fitting steps (steps 4 and 5 of the procedure) are repeated for all possible values of T (from 0 to 1), to identify a restricted maximum likelihood (REML) solution for map position p. By analogy with FARNIR et al. 2002, we denote the hypothesis corresponding to this REML solution as H2.

    QTL mapping exploiting linkage only: Note that the previous model can be extended with minor modifications to map QTL by exploiting linkage information only. This is simply achieved by ignoring all MCs and considering that all SCs belong to distinct haplotype clusters, irrespective of their marker genotype. REML solutions for the different parameters can be found as described in the previous section. Again by analogy with FARNIR et al. 2002, we refer to the corresponding hypothesis as H1.

    Hypothesis testing and significance thresholds: The log likelihoods of the data under the H2 and H1 hypotheses are compared with that under the null hypothesis, H0, of no QTL at map position p. The latter is computed as described above but using the reduced model,

    $$y = Xb + Z_u u + e$$

    Evidence in favor of a QTL at map position p can then be expressed as a lod score:

    $$\mathrm{lod}_p = \frac{L_{H_{1/2}} - L_{H_0}}{\ln 10}$$

    where L_H1/2 denotes the maximized restricted log likelihood under H1 or H2, and L_H0 that under H0.

    As is customary when performing interval mapping, we slide the hypothetical position of the QTL along the chromosome map and compute lod scores at each map position as described, to generate chromosome-wide lod score profiles.

    KIM and GEORGES 2002 have shown by simulation that when analyzing a chromosome of 100 cM with a marker density of one marker every 5 cM, 2 × (L_H1/2 - L_H0) has (under the null hypothesis) an approximate chi-square distribution with 2 d.f. corrected (Bonferroni correction) for two and six independent tests when considering, respectively, H1 and H2. Chromosome-wide significance levels were computed from these distributions in this study.

    Sequencing the coding portion of the growth hormone receptor from genomic DNA:
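
    The lod score and the Bonferroni-corrected chromosome-wide significance level can be sketched as below, assuming that the log likelihoods are natural logarithms. The sketch uses the fact that the upper tail of a chi-square distribution with 2 d.f. is exp(-x/2); function names are illustrative.

```python
import math


def lod_score(loglik_alt, loglik_null):
    """Lod score: log10 of the likelihood ratio, from natural-log likelihoods."""
    return (loglik_alt - loglik_null) / math.log(10.0)


def chromosome_wide_p(loglik_alt, loglik_null, n_independent_tests):
    """Bonferroni-corrected p-value of 2 * (L_alt - L_null) under chi-square, 2 d.f.

    n_independent_tests: 2 for H1 (linkage only), 6 for H2 (linkage + LD),
    following the simulation results quoted in the text.
    """
    lrt = 2.0 * (loglik_alt - loglik_null)
    if lrt <= 0.0:
        return 1.0
    pointwise = math.exp(-lrt / 2.0)  # survival function of chi-square with 2 d.f.
    return min(1.0, n_independent_tests * pointwise)


# A log-likelihood difference of ln(1000) corresponds to a lod score of 3.
delta = math.log(1000.0)
lod = lod_score(delta, 0.0)
p_h1 = chromosome_wide_p(delta, 0.0, n_independent_tests=2)
p_h2 = chromosome_wide_p(delta, 0.0, n_independent_tests=6)
```

    Under these assumptions the pointwise p-value is simply 10 raised to minus the lod score, so a lod score of 3 gives chromosome-wide p-values of about 0.002 under H1 and 0.006 under H2.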

    To develop primers that would allow us to conveniently amplify and sequence the entire growth hormone receptor (GHR) coding sequence from bovine genomic DNA, we screened a bovine bacterial artificial chromosome (BAC) library (WARREN et al. 2000) using standard procedures with an oligonucleotide probe complementary to exon 10 and isolated eight GHR-containing clones. DNA from one of these clones was used as template for sequencing the intron-exon boundaries using exonic primers designed on the basis of the bovine cDNA sequence (e.g., HAUSER et al. 1990) and predicted to flank exon-intron boundaries assuming conservation of intron position between human and cattle (e.g., GODOWSKI et al. 1989). On the basis of the obtained intronic information, we then designed primers (Table 2) to amplify and sequence most of the GHR coding sequence from genomic DNA using standard procedures. Sequence traces were analyzed with the POLYPHRED software (NICKERSON et al. 1997).

    Table 2. Primers used for amplification and sequencing of the GHR exons from bovine genomic DNA (table omitted)

    Oligonucleotide ligation assay:

    An oligonucleotide ligation assay (OLA) test to genotype the GHR F279Y, Nt864-33(T-G), Nt933+21(A-G), Nt1095(T-C), N528T, and Nt1922(C-T) single-nucleotide polymorphisms (SNPs) in multiplex was developed as previously described (KARIM et al. 2000). The primers used for the PCR amplification step and the ligation reaction are reported in Table 3.

    Table 3. Primers (5'–3') used for OLA multiplexing of GHR SNPs (table omitted)

    Estimating the effect on milk yield and composition associated with the F279Y polymorphism in the general dairy cattle population:

    The effect of the F279Y genotype on milk yield and composition was estimated using the model

    (Sarah Blott, Jong-Joo Kim, Sirja Moisio, Anne Schmidt-Küntzel, Anne Cornet, Paulette Berzi, Nadine Cambis)